-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(Big performance change) Do not run lints that cannot emit #125116
Conversation
rustbot has assigned @michaelwoerister. Use |
These commits modify the If this was unintentional then you should revert the changes before this PR is merged. Some changes occurred in src/tools/clippy cc @rust-lang/clippy |
cc @nnethercote @Kobzol, the perf wizards. Could you please give this PR a look and tell me if there are any obvious performance issues on the filtering? |
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
…r=<try> (Big performance change) Do not run lints that cannot emit Before this lint, adding a lint was a difficult matter because it always had some overhead involved. This was because all lints would run, no matter their default level, or if the user had `#![allow]`ed them. This PR changes that. This change would improve both the Rust lint infrastructure and Clippy, but Clippy will see the most benefit, as it has about 900 registered lints (and growing!) So yeah, with this little patch we filter all lints pre-linting, and remove any lint that is either: - Manually `#![allow]`ed in the whole crate, - Allowed in the command line, or - Not manually enabled with `#[warn]` or similar, and its default level is `Allow` I open this PR to receive some feedback, mainly related to performance. We have lots of `Lock`s, `with_lock` and similar functions (also lots of cloning), so the filtering performance is not the best. In an older iteration, instead of doing this in the parsing phase, we developed a visitor with the same function but without so many locks, would reverting to that change help? I'm not sure tbh.
This comment has been minimized.
This comment has been minimized.
@lqd haven't you tried something like this before? 🤔 |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
We've tried a few different things yes, and so has blyxyas -- it maybe wasn't exactly like this, but I encountered annoyances like: some slow const eval loadbearing lint that shouldn't be ignored, lints that would be allowed unexpectedly because cargo allows lints unconditionally on dependencies (arguably the most common usage, and where perf gains would show up AFAICT) but some may trigger FCWs or are required to lint on dependencies despite being allowed, et cetera. Refactoring and fixing all these were too costly compared to the gains at the time, as rustc's lints were fast enough on dependencies, also a "rarer" use-case. That being said, we've added and uplifted more lints since then, including possibly costly ones like the non local impls one, and the situation may also be different for clippy itself (but we won't see that in the perf.rlo results, only locally with the clippy dedicated commands IIUC) |
Finished benchmarking commit (cc1d40f): comparison URL. Overall result: ❌ regressions - ACTION NEEDEDBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)Results (primary 2.3%, secondary -0.5%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (secondary 2.4%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 676.788s -> 676.098s (-0.10%) |
The benchmark doesn't check clippy, right? As lqd hinted at as well? And without splitting allow-by-default rustc lints it does nothing without clippy, so I think this just shows how much time it takes to filter them (can someone else confirm this :3c) Thus, basically nothing it seems :3 (So @blyxyas maybe the cloning is ok?) |
Yeah, the benchmarks currently doesn't check Clippy, that's why I'm currently benchmarking on a different server via SSH (A server that we got explicitly to benchmark Clippy). I'll post the results here when they arrive :) Also, it currently doesn't check builtin lints because I'm having some issues checking that. That's also part of why I decided to open the PR, maybe someone has some idea (I'll see if I can read the previous attempts by lqd, maybe I can learn something from them) EDIT: Seems like lqd hasn't pushed their attempts, I'll have to keep trying new approaches by myself. |
Okis, here are the results (Wall time, Clippy) Wall time❌ Max RSS❌ Instructions❌ Cycles[ +0.42%, +25.48%] |
Those wall times are proof that this optimization has a lot of potential, the main drawback is that the filtering / parsing code is not fast enough, so in some scenarios that I'm not really able to determine exactly what do they have in common, the optimization goes backwards. But a ~70% in Wall time, that's great and we should look more into it. |
I wouldn't draw too many conclusions from these results, they seem to be quite unstable (there is also a 190% walltime regression). Note that even for PRs that don't have large perf. impacts, we can see ~30% walltime swings even on the stable benchmarking server (https://perf.rust-lang.org/compare.html?start=9e7aff794539aa040362f4424eb29207449ffce0&end=44fa5fd39a1d2af41bd7f43bc246a5e4f6d94696&stat=wall-time&nonRelevant=true). |
I've changed the system, we're back to using visitors (I've benchmarked this new commit, it should have 0 regressions and about -0.66% improvement) @bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
70e9bc2
to
ddad55f
Compare
64c914b
to
1dcfa27
Compare
Do you intend to investigate https://github.com/rust-lang/rust/pull/125116/files#r1770598991 as a follow-up? Thanks for the work anyway! |
☀️ Test successful - checks-actions |
Finished benchmarking commit (4d88de2): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)Results (secondary -4.7%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 781.782s -> 783.187s (0.18%) |
Do not filter empty lint passes & re-do CTFE pass Some structs implement `LintPass` without having a `Lint` associated with them rust-lang#125116 broke that behaviour by filtering them out. This PR ensures that lintless passes are not filtered out.
Rollup merge of rust-lang#132637 - blyxyas:lint-less-passes, r=flip1995 Do not filter empty lint passes & re-do CTFE pass Some structs implement `LintPass` without having a `Lint` associated with them rust-lang#125116 broke that behaviour by filtering them out. This PR ensures that lintless passes are not filtered out.
…r=cjgillot (Big performance change) Do not run lints that cannot emit Before this change, adding a lint was a difficult matter because it always had some overhead involved. This was because all lints would run, no matter their default level, or if the user had `#![allow]`ed them. This PR changes that. This change would improve both the Rust lint infrastructure and Clippy, but Clippy will see the most benefit, as it has about 900 registered lints (and growing!) So yeah, with this little patch we filter all lints pre-linting, and remove any lint that is either: - Manually `#![allow]`ed in the whole crate, - Allowed in the command line, or - Not manually enabled with `#[warn]` or similar, and its default level is `Allow` As some lints **need** to run, this PR also adds **loadbearing lints**. On a lint declaration, you can use the ``@eval_always` = true` marker to label it as loadbearing. A loadbearing lint will never be filtered (it will always run) Fixes rust-lang#106983
Do not filter empty lint passes & re-do CTFE pass Some structs implement `LintPass` without having a `Lint` associated with them rust-lang#125116 broke that behaviour by filtering them out. This PR ensures that lintless passes are not filtered out.
fn lints_that_dont_need_to_run(tcx: TyCtxt<'_>, (): ()) -> FxIndexSet<LintId> { | ||
let store = unerased_lint_store(&tcx.sess); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Future-compatibility lints that show up in cargo's reports are supposed to be emitted even if they are allowed in the crate. Isn't that something that this logic needs to account for somehow? Or do we account for this already in a different way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened #133108 to fix this.
…n, r=lcnr lints_that_dont_need_to_run: never skip future-compat-reported lints Follow-up to rust-lang#125116: future-compat lints show up with `--json=future-incompat` even if they are otherwise allowed in the crate. So let's ensure we do not skip those as part of the `lints_that_dont_need_to_run` logic. I could not find a current future compat lint that is emitted by a lint pass, so there's no clear way to add a test for this. Cc `@blyxyas` `@cjgillot`
Rollup merge of rust-lang#133108 - RalfJung:future-compat-needs-to-run, r=lcnr lints_that_dont_need_to_run: never skip future-compat-reported lints Follow-up to rust-lang#125116: future-compat lints show up with `--json=future-incompat` even if they are otherwise allowed in the crate. So let's ensure we do not skip those as part of the `lints_that_dont_need_to_run` logic. I could not find a current future compat lint that is emitted by a lint pass, so there's no clear way to add a test for this. Cc `@blyxyas` `@cjgillot`
lints_that_dont_need_to_run: never skip future-compat-reported lints Follow-up to rust-lang/rust#125116: future-compat lints show up with `--json=future-incompat` even if they are otherwise allowed in the crate. So let's ensure we do not skip those as part of the `lints_that_dont_need_to_run` logic. I could not find a current future compat lint that is emitted by a lint pass, so there's no clear way to add a test for this. Cc `@blyxyas` `@cjgillot`
Before this change, adding a lint was a difficult matter because it always had some overhead involved. This was because all lints would run, no matter their default level, or if the user had
#![allow]
ed them. This PR changes that. This change would improve both the Rust lint infrastructure and Clippy, but Clippy will see the most benefit, as it has about 900 registered lints (and growing!)So yeah, with this little patch we filter all lints pre-linting, and remove any lint that is either:
#![allow]
ed in the whole crate,#[warn]
or similar, and its default level isAllow
As some lints need to run, this PR also adds loadbearing lints. On a lint declaration, you can use the
@eval_always = true
marker to label it as loadbearing. A loadbearing lint will never be filtered (it will always run)Fixes #106983